Conditional observed score (COS) and latent trait 
(LT) definitions of differential item functioning (DIF) are explored 
to determine when they are equivalent. COS methods rely solely on 
observed measurements, and LT methods model the response to an item 
as a function of an unobserved hypothetical latent ability or trait. 
For the case of dichotomous test items, the COS approach defines DIF 
by population differences in the conditional probabilities of 
responding correctly to the item, conditioning on the observed 
ability measure. The LT approach defines DIF by population 
differences in the conditional probabilities of responding correctly 
to the item, conditioning on the latent trait. Although DIF detection 
methods are usually applied to dichotomous ability or achievement 
test items, the discussion focuses on a more general level, and the 
results may have applications for general questions of measurement 
invariance in multiple populations. It is concluded that the 
conditions under which COS and LT definitions are equivalent are 
quite specialized. Equivalence generally requires invariance of the 
conditional densities. Other conditions of equivalence are discussed. 
It appears that precise invariance of prior densities can rarely be 
assumed in practice. (TJH) 
Methods for detecting differential item functioning (DIF) across 
multiple examinee populations have been under development for several 
decades. We can classify the available methods in two categories. The 
first category includes methods that rely solely on observed 
measurements (item scores, subtest scores, or external measurements) for 
the detection of DIF. This category includes methods using transformed 
item difficulties (Angoff, 1982; Angoff & Ford, 1973) and also methods 
that examine the conditional association between item scores and 
population membership within ability groups defined by an observed 
measure (Holland & Thayer, 1986; Ironson, 1982; Marascuilo & Slaughter, 
1981; Scheuneman, 1979; Shepard, Camilli, & Averill, 1981). We will 
focus on these conditional methods, which will be denoted "conditional 
observed score" (COS) methods. The second category includes methods 
that model the response to an item as a function of an unobserved, 
hypothetical latent ability or trait. Within the hypothesized model, 
DIF is defined by population differences in the function relating the 
item response to the latent trait (Lord, 1980). Methods within this 
category will be denoted "latent trait" (LT) methods. 

COS and LT methods rest on different definitions of DIF. For the 
case of dichotomous test items, the COS approach defines DIF by 
population differences in the conditional probabilities of responding 
correctly to the item, conditioning on the observed ability measure. 
The LT approach defines DIF by population differences in the conditional 
probabilities of responding correctly to the item, conditioning on the 
latent trait. When can these two definitions be considered equivalent 
in theoretical terms? If the definitions are equivalent, the COS 
approach has an advantage in eliminating the need for computer-intensive 
model-fitting and estimation. If the methods are not equivalent, 
investigators may reach different conclusions regarding DIF depending on 
the method chosen. 

In the following, we explore the conditions under which the COS and 
LT definitions of DIF will be equivalent. Although DIF detection 
methods are usually applied to dichotomous ability or achievement test 
items, the discussion will be at a more general level, and the results 
may have applications for general questions of measurement invariance in 
multiple populations. 

Conditions for Equivalence 

Let u, v, and w be three random variables, possibly vector-valued. 
We will assume w to be a latent variable of interest, and u to be an 
observable variable that is intended to measure w. Concern for DIF will 
center on u. A second observable variable v will be used in studying 
the possible DIF in u through COS methods. The variables u, v, and w 
may be discrete or continuous, but our notation will treat all variables 
as continuous except where needed in discrete examples. 

Let $h_i(w)$ be the density function for w (the prior density) 
defined with respect to the ith population of interest, i = 1, ..., S. These 
populations will ordinarily be defined in terms of observed variables 
such as ethnicity, gender, or age. Let $g_i(u \mid w)$ be the conditional 
density function for u given w in the ith population. We can also 
define the conditional density functions $f_i(u \mid v)$, $t_i(v \mid w)$, 
$q_i(v \mid u, w)$, and $c_i(u \mid v, w)$. Finally, let $d_i(u, v \mid w)$ be the 
conditional joint density function for u and v given w. 

We define measurement invariance or lack of DIF to hold for u as a 
measure of w when 

$$ g_i(u \mid w) = g(u \mid w) \tag{1} $$

for i = 1, ..., S. If u is a dichotomous test item, this definition is 
identical to the LT definition of DIF if the notation is altered to 
reflect the discrete nature of u: 

$$ g_i(u \mid w) = P_i(u = 1 \mid w) = P(u = 1 \mid w). $$
Invariance holds when the item characteristic curves are identical among 
the populations of interest (Lord, 1980). If u is continuous, we could 
define weaker forms of invariance than that given in (1). For example, 
if u is a vector of observed measures and w is a vector of factor scores 
within the common factor model, the usual definition of factorial 
invariance would involve only the conditional first and second moments 
of u (Meredith, 1964a, 1964b). 

COS definitions of DIF employ the conditional density $f_i(u \mid v)$. 
We can define COS invariance for u with respect to v as 

$$ f_i(u \mid v) = f(u \mid v) \tag{2} $$

for i = 1, ..., S. If u is a dichotomous test item and v is the unweighted 
total test score (possibly omitting u), the definition in (2) reduces to 
the null hypothesis examined in most chi-square-based COS procedures if 

$$ f_i(u \mid v) = P_i(u = 1 \mid v) = P(u = 1 \mid v). $$
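
As a rough illustration of the quantities compared under (2), the sketch below tabulates the observed proportion of correct responses at each level of v within each population. The function name `conditional_proportions`, the population labels, and the records are all made up for illustration. A real COS procedure would go on to test whether these proportions differ across populations, which is not shown here.

```python
from collections import defaultdict

def conditional_proportions(records):
    """Return {(group, v): proportion correct} from (group, v, u) records."""
    counts = defaultdict(lambda: [0, 0])  # [number correct, number of examinees]
    for group, v, u in records:
        counts[(group, v)][0] += u
        counts[(group, v)][1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}

# Made-up toy records: (population label, matching score v, item score u).
records = [
    ("A", 3, 1), ("A", 3, 0), ("A", 3, 1), ("A", 3, 1),
    ("B", 3, 1), ("B", 3, 0), ("B", 3, 0), ("B", 3, 1),
    ("A", 5, 1), ("A", 5, 1), ("B", 5, 1), ("B", 5, 1),
]

props = conditional_proportions(records)
for v in sorted({v for _, v, _ in records}):
    # Sample analogues of P_A(u=1 | v) and P_B(u=1 | v).
    print(v, props[("A", v)], props[("B", v)])
```

Under (2), the two columns would agree at every level of v up to sampling error.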

When will measurement invariance as defined in (1) be equivalent to 
invariance as defined in (2)? Is it possible to have invariance in 
$g_i(u \mid w)$ but not in $f_i(u \mid v)$? Is the converse possible? 

First, it can be easily shown that the two definitions need not be 
equivalent generally. Assume that invariance holds as in (1). Also 
assume local independence between u and v with respect to w. In other 
words, if $d_i(u, v \mid w)$ is the conditional joint density for u and v, we 
have 

$$ d_i(u, v \mid w) = g(u \mid w)\, t_i(v \mid w). \tag{3} $$

Then we can express $f_i(u \mid v)$ as 

$$ f_i(u \mid v) = \frac{\int g(u \mid w)\, t_i(v \mid w)\, h_i(w)\, dw}{\int t_i(v \mid w)\, h_i(w)\, dw}. \tag{4} $$

From this equation, it is clear that (2) will generally hold only if the 
product $t_i(v \mid w)\, h_i(w)$ is invariant. In particular, population 
differences in the prior densities $h_i(w)$ can result in differences 
in the conditional densities $f_i(u \mid v)$. 

A practical example of this sort would occur if u is a dichotomous 
item score in a test containing p items, v is the sum of the remaining 
p-1 item scores on the test, and all items follow the Rasch model with 
latent trait w. In this case, u and v are locally independent with 
respect to w. Invariance in (2) may not hold even if the condition in 
(1) does hold. 

As this example illustrates, local independence between u and v 
with respect to w is an important consideration in evaluating the 
equivalence of (1) and (2). Equation (4) suggests that the two 
definitions need not be equivalent when u and v are locally independent. 
Clearly, special cases may exist in which the two definitions are 
equivalent in spite of local independence between u and v. For example, 
if both the prior densities $h_i(w)$ and the conditional densities 
$t_i(v \mid w)$ are invariant, the definitions are equivalent. 
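
The Rasch example with v excluding u can be checked numerically. The sketch below is an illustration under assumed values, not the authors' computation: the item difficulties, the grid over w, and the two discretized normal priors are all hypothetical, and the integrals in equation (4) are replaced by sums over the grid. The item parameters are identical in both populations, so (1) holds for every item and only the priors differ.

```python
import math

def rasch_p(w, b):
    """Rasch probability of a correct response at trait w, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(w - b)))

def score_dist(w, difficulties):
    """t(v | w): distribution of the sum of locally independent items."""
    dist = [1.0]
    for b in difficulties:
        p = rasch_p(w, b)
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[k + 1] += mass * p
        dist = new
    return dist

def normal_prior(grid, mean):
    """Discretized unit-variance normal prior h_i(w), normalized on the grid."""
    weights = [math.exp(-0.5 * (w - mean) ** 2) for w in grid]
    total = sum(weights)
    return [x / total for x in weights]

def cos_prob(v, b_u, rest, grid, prior):
    """P_i(u=1 | v) via equation (4), with sums in place of integrals."""
    num = den = 0.0
    for w, h in zip(grid, prior):
        t = score_dist(w, rest)[v]
        num += rasch_p(w, b_u) * t * h
        den += t * h
    return num / den

# Assumed illustrative values, not taken from the paper.
grid = [-4.0 + 0.1 * k for k in range(81)]
b_u = 0.0                        # difficulty of the studied item u
rest = [-1.0, -0.5, 0.5, 1.0]    # difficulties of the p-1 items summed in v
prior_1 = normal_prior(grid, 0.0)
prior_2 = normal_prior(grid, -1.0)

scores = range(len(rest) + 1)
f_pop1 = [cos_prob(v, b_u, rest, grid, prior_1) for v in scores]
f_pop2 = [cos_prob(v, b_u, rest, grid, prior_2) for v in scores]
for v in scores:
    print(v, round(f_pop1[v], 3), round(f_pop2[v], 3))
```

With these assumed values the two columns should differ at each score level, even though the response function for u is the same in both populations. Passing the same prior for both populations makes the columns coincide.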

Suppose that local independence does not hold for u and v. Then 
(4) can be written 

$$ f_i(u \mid v) = \frac{\int c_i(u \mid v, w)\, t_i(v \mid w)\, h_i(w)\, dw}{\int t_i(v \mid w)\, h_i(w)\, dw}. \tag{5} $$
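
The steps behind (5) can be spelled out as follows. This is a sketch using only the densities defined earlier: the conditional density of u given v is the joint density of u and v divided by the marginal density of v, with w integrated out of both.

```latex
\begin{align*}
f_i(u \mid v)
  &= \frac{\int d_i(u, v \mid w)\, h_i(w)\, dw}
          {\int t_i(v \mid w)\, h_i(w)\, dw}
  && \text{(joint density of $u, v$ over marginal density of $v$)} \\
  &= \frac{\int c_i(u \mid v, w)\, t_i(v \mid w)\, h_i(w)\, dw}
          {\int t_i(v \mid w)\, h_i(w)\, dw}
  && \text{(since $d_i(u, v \mid w) = c_i(u \mid v, w)\, t_i(v \mid w)$)}
\end{align*}
```

When u and v are locally independent and (1) holds, $c_i(u \mid v, w) = g(u \mid w)$, and the last line reduces to equation (4).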

Now consider the special case in which v is a sufficient statistic for 
w. Then $c_i(u \mid v, w) = f_i(u \mid v)$, and this conditional density does 
not involve w. Population differences in the prior densities $h_i(w)$ 
will not prevent invariance in $f_i(u \mid v)$. Suppose that (1) also 
holds. Must (2) then hold in this case? 

The answer is no, not in general. Given sufficiency, we know that 

$$ c_i(u \mid v, w) = f_i(u \mid v) = \frac{g(u \mid w)\, q_i(v \mid u, w)}{t_i(v \mid w)}. \tag{6} $$

Note that we cannot have $q_i(v \mid u, w) = t_i(v \mid w)$ because this 
implies local independence of u and v. Then (2) will hold only if the 
ratios 

$$ \frac{q_i(v \mid u, w)}{t_i(v \mid w)} $$

are also invariant for i = 1, ..., S. 
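
Equation (6) follows from factoring the conditional joint density of u and v in two ways. A short sketch of the step, using the densities defined earlier and assuming (1) holds:

```latex
\begin{align*}
d_i(u, v \mid w)
  &= c_i(u \mid v, w)\, t_i(v \mid w)
   = g_i(u \mid w)\, q_i(v \mid u, w), \\
c_i(u \mid v, w)
  &= \frac{g_i(u \mid w)\, q_i(v \mid u, w)}{t_i(v \mid w)}
   = \frac{g(u \mid w)\, q_i(v \mid u, w)}{t_i(v \mid w)}
  && \text{(using (1))}.
\end{align*}
```

Sufficiency of v then gives $c_i(u \mid v, w) = f_i(u \mid v)$, so any population dependence in $f_i(u \mid v)$ must enter through the ratio of $q_i$ to $t_i$.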

As an example, suppose that u is a dichotomous item score, v is the 
sum of p item scores including u, and all items follow a Rasch model 
with latent trait w. Then v is sufficient for w, but u and v are not 
locally independent. In this case, 

$$ q_i(v \mid u, w) = q_i(v_{p-1} \mid w), $$

where $v_{p-1}$ is the sum of the p-1 items excluding u. Then the 
$f_i(u \mid v)$ are invariant if the ratios 

$$ \frac{q_i(v_{p-1} \mid w)}{t_i(v \mid w)} $$

are invariant. Since $g(u \mid w)$ is assumed invariant, the ratios are 
invariant if the numerators of the ratios are invariant, or (1) holds 
for the p-1 items excluding u. 
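
The role of sufficiency in this Rasch example can be illustrated numerically. The sketch below assumes hypothetical item difficulties, a grid over w, and two discretized normal priors, none of which come from the paper. Item parameters are the same in both populations, v is the total score including u, and the integrals are approximated by sums over the grid.

```python
import math

def rasch_p(w, b):
    """Rasch probability of a correct response at trait w, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(w - b)))

def score_dist(w, difficulties):
    """Distribution of the sum of locally independent Rasch items at w."""
    dist = [1.0]
    for b in difficulties:
        p = rasch_p(w, b)
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[k + 1] += mass * p
        dist = new
    return dist

def normal_prior(grid, mean):
    """Discretized unit-variance normal prior, normalized on the grid."""
    weights = [math.exp(-0.5 * (w - mean) ** 2) for w in grid]
    total = sum(weights)
    return [x / total for x in weights]

def cos_prob_total(v, b_u, rest, grid, prior):
    """P_i(u=1 | v) when v is the total score over u and the rest items."""
    num = den = 0.0
    for w, h in zip(grid, prior):
        p_u = rasch_p(w, b_u)
        rest_dist = score_dist(w, rest)
        joint_1 = p_u * rest_dist[v - 1] if v >= 1 else 0.0
        joint_0 = (1.0 - p_u) * rest_dist[v] if v < len(rest_dist) else 0.0
        num += joint_1 * h
        den += (joint_1 + joint_0) * h
    return num / den

# Assumed illustrative values, not taken from the paper.
grid = [-4.0 + 0.1 * k for k in range(81)]
b_u = 0.3
rest = [-1.0, -0.5, 0.5, 1.0]
prior_1 = normal_prior(grid, 0.0)
prior_2 = normal_prior(grid, -1.0)

scores = range(len(rest) + 2)  # total score runs from 0 to p
gaps = [abs(cos_prob_total(v, b_u, rest, grid, prior_1)
            - cos_prob_total(v, b_u, rest, grid, prior_2)) for v in scores]
max_gap = max(gaps)
print(max_gap)
```

Here `max_gap` should be zero up to floating-point rounding: because the conditional probability of u given the total score does not involve w under the Rasch model, the difference in priors drops out.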

In the above case, we have assumed that v is a sufficient statistic 
for w. Suppose that v is not sufficient for w. If local independence 
does not hold for u and v, when are definitions (1) and (2) equivalent? 
In this case, (5) does not reduce to any simple form in general. The 
definition in (2) generally holds only if the prior densities are 
invariant, and if $c_i(u \mid v, w)$ and $t_i(v \mid w)$ are both invariant. 
Again, it is possible that for some specific choices of prior densities, 
$c_i(u \mid v, w)$, and $t_i(v \mid w)$, the two definitions can be made 
equivalent. 

A familiar example of the general case occurs when u is a 
dichotomous item score, v is the unweighted sum of p item scores 
including u, and all items follow a two-parameter logistic model with 
latent trait w. Local independence does not hold between u and v, and v 
is not sufficient because it is an unweighted sum. Algebraically, it 
can be shown that if (1) holds, (2) generally holds only if (1) also 
holds for the p-1 items excluding u and if the prior densities are 
invariant. 
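
A numerical sketch of the two-parameter logistic case follows. All values are assumptions chosen for illustration: the discrimination and difficulty parameters, the grid over w, and the two discretized normal priors are hypothetical. Item parameters are the same in both populations, so (1) holds for every item, and v is the unweighted total including u.

```python
import math

def p_correct(w, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (w - b)))

def score_dist(w, items):
    """Distribution of the unweighted sum of locally independent items."""
    dist = [1.0]
    for a, b in items:
        p = p_correct(w, a, b)
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[k + 1] += mass * p
        dist = new
    return dist

def normal_prior(grid, mean):
    """Discretized unit-variance normal prior, normalized on the grid."""
    weights = [math.exp(-0.5 * (w - mean) ** 2) for w in grid]
    total = sum(weights)
    return [x / total for x in weights]

def cos_prob_total(v, item_u, rest, grid, prior):
    """P_i(u=1 | v) when v is the unweighted total over u and the rest items."""
    num = den = 0.0
    for w, h in zip(grid, prior):
        p_u = p_correct(w, *item_u)
        rest_dist = score_dist(w, rest)
        joint_1 = p_u * rest_dist[v - 1] if v >= 1 else 0.0
        joint_0 = (1.0 - p_u) * rest_dist[v] if v < len(rest_dist) else 0.0
        num += joint_1 * h
        den += (joint_1 + joint_0) * h
    return num / den

# Assumed illustrative (discrimination, difficulty) pairs, not from the paper.
grid = [-4.0 + 0.1 * k for k in range(81)]
item_u = (2.0, 0.0)
rest = [(0.5, -1.0), (0.5, -0.5), (0.5, 0.5), (0.5, 1.0)]
prior_1 = normal_prior(grid, 0.0)
prior_2 = normal_prior(grid, -1.0)

scores = range(len(rest) + 2)  # total score runs from 0 to p
f_pop1 = [cos_prob_total(v, item_u, rest, grid, prior_1) for v in scores]
f_pop2 = [cos_prob_total(v, item_u, rest, grid, prior_2) for v in scores]
for v in scores:
    print(v, round(f_pop1[v], 3), round(f_pop2[v], 3))
```

With unequal discriminations the unweighted total is not sufficient, so the two columns should differ at the interior score levels. Setting every discrimination to the same value recovers the Rasch-type case, in which the columns agree.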




The foregoing results show that when u and v are not locally 
independent, the sufficiency of v with respect to w is an important 
consideration in determining the equivalence of definitions (1) and (2). 
When sufficiency holds, equivalence does not require invariance of the 
prior densities $h_i(w)$. If v is not sufficient for w, population 
differences in these densities will generally prohibit the equivalence 
of (1) and (2). 

Conclusion 

The conditions under which the COS and LT definitions are 
equivalent are quite specialized. First, equivalence generally requires 
invariance of the conditional densities $t_i(v \mid w)$. In the COS 
approach, this entails careful selection of the observed measure v to 
avoid differential functioning in this measure. This fact is generally 
recognized (Ironson, 1982). Second, the precise conditions for 
equivalence depend on both the local independence of u and v and the 
possible sufficiency of v for the latent trait w. If u and v are 
locally independent, equivalence generally requires invariance in the 
prior densities and in the conditional densities $t_i(v \mid w)$. Note that 
local independence of u and v precludes sufficiency of v for w in the 
cases of interest. But if local independence does not hold, the 
sufficiency of v for w is important. Given sufficiency, the equivalence 
of (1) and (2) does not require invariance of the prior densities. 

In practical applications, v is typically an unweighted sum of item 
scores. If local independence can be assumed among these items with 
respect to a latent trait w, local independence between u and v simply 
depends on whether u is included in the summation leading to v. There 
is an advantage to including u in the summation for v, thereby removing 
the local independence. This point was demonstrated by Holland and 
Thayer (1986) in a slightly different context. On the other hand, an 
unweighted sum of item scores will be sufficient for w only when the 
items fit the Rasch model with latent trait w, or when all items have 
identical discrimination parameters. Hence considerations of 
sufficiency may have limited practical value. 

If v is not sufficient for w, population differences in the prior 
densities $h_i(w)$ will generally prevent the equivalence of (1) and 
(2). Precise invariance of the prior densities can rarely be assumed in 
practice. Since sufficiency of v is also unusual, we must conclude that 
formal equivalence between (1) and (2) will be the exception rather 
than the rule. 